X/Twitter Intelligence Scraper Pro

Pricing

from $1.00 / 1,000 results

Scrape public X/Twitter posts and profiles from search terms, handles, tweet URLs, and lists. Export clean tweet, author, media, engagement, and monitoring data for research, marketing, and social listening.

Rating

0.0 (0)

Developer

Muhammad Qaseem Iqbal

Maintained by Community

Actor stats

Bookmarked: 0

Total users: 2

Monthly active users: 1

Last modified: 3 days ago

X/Twitter Intelligence Scraper Pro 🚀

Collect public X/Twitter data for monitoring, research, reporting, dashboards, spreadsheets, and AI workflows.

This actor can scrape public search results, profile timelines, tweet URLs, lists, conversations, and recurring monitoring runs. By default it uses cheap HTTP requests. When browser fallback is enabled, it launches a browser only if X/Twitter blocks simple page access or does not return enough data.

⚠️ Important: X/Twitter often changes how public pages load. Cheap HTTP-only runs may return warning records instead of tweets when X serves a generic page. For more reliable results, use http_first with browserFallbackOnEmpty: true.

TL;DR ⚡

Want to scrape tweets from a profile and get real results reliably? Start here:

{
  "scrapeMode": "profiles",
  "twitterHandles": ["NASA"],
  "maxItems": 10,
  "maxItemsPerProfile": 10,
  "crawlStrategy": "http_first",
  "browserFallbackOnEmpty": true,
  "maxConcurrency": 1,
  "downloadMedia": false
}

Want the cheapest possible test run first?

{
  "scrapeMode": "search",
  "searchTerms": ["from:NASA lang:en"],
  "maxItems": 10,
  "crawlStrategy": "http_only",
  "downloadMedia": false
}

If the cheapest run returns X_ACCESS_WARNING or NO_TWEETS_FOUND, switch to the first example with browser fallback enabled.
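
The escalation rule above can be sketched in a few lines of Python. This is only an illustration, not the actor's own logic: it assumes warning records expose their code in a `code` field (this README does not show a warning record), and the helper names are hypothetical. It keeps the original scrape mode and only changes the crawl settings.

```python
# Assumption: warning records carry their code in a "code" field.
FALLBACK_CODES = {"X_ACCESS_WARNING", "NO_TWEETS_FOUND"}


def needs_browser_fallback(records):
    """True if the run produced no tweets or any record carries a fallback code."""
    has_tweet = any(r.get("recordType") == "tweet" for r in records)
    has_warning = any(r.get("code") in FALLBACK_CODES for r in records)
    return has_warning or not has_tweet


def with_browser_fallback(run_input):
    """Return a copy of the input switched to the more reliable settings."""
    upgraded = dict(run_input)
    upgraded["crawlStrategy"] = "http_first"
    upgraded["browserFallbackOnEmpty"] = True
    return upgraded


cheap_input = {
    "scrapeMode": "search",
    "searchTerms": ["from:NASA lang:en"],
    "maxItems": 10,
    "crawlStrategy": "http_only",
    "downloadMedia": False,
}

# Hypothetical dataset content from the cheap run.
sample_records = [{"recordType": "warning", "code": "X_ACCESS_WARNING"}]

if needs_browser_fallback(sample_records):
    next_input = with_browser_fallback(cheap_input)
else:
    next_input = cheap_input
```

Here `next_input` is what you would submit for the second run; `cheap_input` is left unchanged so you can compare the two.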

What You Can Collect 📥

| Data source | What to enter | Example |
| --- | --- | --- |
| Search results | Search terms or advanced X/Twitter queries | from:NASA lang:en |
| Profile timelines | One or more handles | NASA, OpenAI, apify |
| Tweet details | Tweet URLs or tweet IDs | https://x.com/NASA/status/... |
| Start URLs | Profile, tweet, list, or search URLs | https://x.com/NASA |
| Lists | Public X/Twitter list URLs | https://x.com/i/lists/... |
| Conversations | Tweet IDs or tweet URLs | Use when you want replies/context |
| Monitoring | Repeated search/profile runs | Emit only newly seen tweets |
| User discovery | Author aggregation from matching tweets | Useful for audience research |

Common Use Cases 🎯

  • Track mentions of a brand, product, event, or person.
  • Collect posts from public profiles for research or reporting.
  • Monitor public conversations around keywords or hashtags.
  • Build datasets for dashboards, spreadsheets, or BI tools.
  • Prepare clean tweet text and metadata for AI search, chatbots, and RAG workflows. RAG means retrieval-augmented generation, a common way to give AI apps source data to search.
  • Watch for new posts over time with persistent monitoring.
  • Export structured tweet records to Apify datasets.

How It Works 🛠️

  1. Choose a scrape mode, such as search, profiles, URLs, tweet details, conversations, or monitoring.
  2. Add search terms, handles, tweet IDs, or X/Twitter URLs.
  3. Set maxItems to control how many records you want.
  4. Choose a crawl strategy:
    • http_only: lowest cost, fastest, but may return warnings if X blocks public HTML.
    • http_first: tries the cheap method first, then can use a browser if enabled.
    • browser_only: highest extraction effort, usually higher cost.
  5. Run the actor and download your results from the dataset.

| Goal | Recommended settings |
| --- | --- |
| Cheapest test | crawlStrategy: "http_only", maxItems: 10, downloadMedia: false |
| More reliable profile scraping | crawlStrategy: "http_first", browserFallbackOnEmpty: true |
| Keep costs low | Use maxConcurrency: 1, avoid media downloads, start with small maxItems |
| Get more search results | Try sort: "top" or sort: "latest_and_top" |
| Avoid duplicate tweets | Keep deduplicateBy: "tweetId" |
| AI-ready output | Enable includeRagFields: true to add fields that are easier for AI apps to search |
| Spreadsheet-friendly output | Enable flattenOutput: true or choose outputFields |
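
If you build inputs in code, the recommendations in this table can be kept as small presets and layered onto a base input. The sketch below is one way to do that; the preset names and the `build_input` helper are hypothetical, and only the field values come from the table.

```python
# Preset names are made up for this example; the values mirror the table above.
PRESETS = {
    "cheapest_test": {"crawlStrategy": "http_only", "maxItems": 10, "downloadMedia": False},
    "reliable_profiles": {"crawlStrategy": "http_first", "browserFallbackOnEmpty": True},
    "low_cost": {"maxConcurrency": 1, "downloadMedia": False},
    "no_duplicates": {"deduplicateBy": "tweetId"},
    "ai_ready": {"includeRagFields": True},
    "spreadsheet": {"flattenOutput": True},
}


def build_input(base, *goals):
    """Layer the chosen presets onto a base input; later goals win on conflicts."""
    merged = dict(base)
    for goal in goals:
        merged.update(PRESETS[goal])
    return merged


profile_run = build_input(
    {"scrapeMode": "profiles", "twitterHandles": ["NASA"], "maxItems": 25},
    "reliable_profiles",
    "low_cost",
)
```

Because later presets overwrite earlier ones, list the goal you care about most last.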

Example Inputs 🧪

Scrape a Profile 👤

{
  "scrapeMode": "profiles",
  "twitterHandles": ["NASA"],
  "maxItems": 25,
  "maxItemsPerProfile": 25,
  "crawlStrategy": "http_first",
  "browserFallbackOnEmpty": true
}

Search for Tweets 🔎

{
  "scrapeMode": "search",
  "searchTerms": ["\"artificial intelligence\" lang:en -filter:retweets"],
  "maxItems": 100,
  "sort": "latest_and_top",
  "crawlStrategy": "http_first",
  "browserFallbackOnEmpty": true
}

Use a Date Range 📅

{
  "scrapeMode": "search",
  "searchTerms": ["from:NASA since:2026-01-01 until:2026-06-01"],
  "maxItems": 200,
  "sort": "top",
  "dateSplitStrategy": "monthly"
}
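
To get a feel for what a monthly date split means, the sketch below breaks the range from the example into one query per calendar month. How the actor splits ranges internally is not documented in this README, so treat this as one plausible reading. The function names are hypothetical, and treating `until` as the exclusive end of each window is an assumption.

```python
from datetime import date


def monthly_windows(since, until):
    """Split the range from since up to until into windows that break on the 1st of each month."""
    windows = []
    start = since
    while start < until:
        # First day of the month after `start`.
        if start.month == 12:
            next_month = date(start.year + 1, 1, 1)
        else:
            next_month = date(start.year, start.month + 1, 1)
        end = min(next_month, until)
        windows.append((start, end))
        start = end
    return windows


def split_query(base_query, since, until):
    """Build one search term per monthly window."""
    return [
        f"{base_query} since:{start.isoformat()} until:{end.isoformat()}"
        for start, end in monthly_windows(since, until)
    ]


queries = split_query("from:NASA", date(2026, 1, 1), date(2026, 6, 1))
for query in queries:
    print(query)
```

Smaller windows can help when a single long range returns fewer results than expected.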

Mix Handles, URLs, and Search Terms 🧩

{
  "scrapeMode": "auto",
  "searchTerms": ["@apify lang:en"],
  "twitterHandles": ["NASA"],
  "startUrls": [{ "url": "https://x.com/OpenAI" }],
  "tweetIds": ["1728108619189874825"],
  "maxItems": 100
}
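
With mixed inputs, each start URL has to be treated as a profile, tweet, list, or search. The actor's own routing is not described here; the sketch below is a guess based only on the URL shapes shown in this README, and `classify_url` is a hypothetical helper you could use to sanity-check your URLs before a run.

```python
from urllib.parse import urlparse


def classify_url(url):
    """Guess the input type of a public X/Twitter URL from its path."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if not parts:
        return "unknown"
    if parts[0] == "search":
        return "search"
    if parts[:2] == ["i", "lists"]:
        return "list"
    if len(parts) >= 3 and parts[1] == "status":
        return "tweet"
    if len(parts) == 1:
        return "profile"
    return "unknown"


start_urls = [
    "https://x.com/OpenAI",
    "https://x.com/NASA/status/2064422103416238295",
    "https://x.com/search?q=from%3ANASA",
]
kinds = {url: classify_url(url) for url in start_urls}
```

Anything classified as "unknown" is worth checking by hand before you spend a run on it.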

Monitor New Results Over Time 🔔

{
  "scrapeMode": "monitoring",
  "searchTerms": ["\"launch announcement\" lang:en"],
  "maxItems": 50,
  "monitoring": {
    "enabled": true,
    "stateKey": "launch-monitor",
    "emitOnlyNewItems": true,
    "lookbackHours": 24
  },
  "deduplicateScope": "persistent"
}
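
The idea behind emitOnlyNewItems is to remember which tweets earlier runs already returned and skip them next time. The sketch below shows that pattern with a local JSON file standing in for the saved state. It is not the actor's implementation: the file-based storage and the `emit_new_items` helper are assumptions made for illustration.

```python
import json
from pathlib import Path


def emit_new_items(records, state_file):
    """Return only records whose tweetId was not seen in earlier calls."""
    path = Path(state_file)
    # Load the IDs remembered from previous runs, if any.
    seen = set(json.loads(path.read_text())) if path.exists() else set()
    fresh = []
    for record in records:
        tweet_id = record.get("tweetId")
        if tweet_id and tweet_id not in seen:
            seen.add(tweet_id)
            fresh.append(record)
    # Persist the updated set so the next run can skip these tweets.
    path.write_text(json.dumps(sorted(seen)))
    return fresh
```

Calling `emit_new_items(records, "launch-monitor.json")` on two consecutive runs returns each tweet only the first time it appears. Using one state file per stateKey keeps separate monitors from interfering with each other.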

Prepare Data for AI Search or Chatbots 🤖

{
  "scrapeMode": "profiles",
  "twitterHandles": ["NASA"],
  "maxItems": 50,
  "includeRagFields": true,
  "cleanText": true,
  "includeRawData": false
}

Main Input Options ⚙️

| Field | Plain-English meaning | Default |
| --- | --- | --- |
| scrapeMode | What kind of run to perform. Use auto if you are mixing inputs. | auto |
| searchTerms | Search queries to run on X/Twitter. Supports advanced search syntax. | [] |
| twitterHandles | Public profile handles to scrape. @ is optional. | [] |
| startUrls | Public X/Twitter URLs, including profiles, tweets, lists, and searches. | [] |
| tweetIds | Tweet IDs to fetch directly. | [] |
| maxItems | Maximum number of output records for the run. | 100 |
| maxItemsPerQuery | Maximum records per search query, URL, or list. | 100 |
| maxItemsPerProfile | Maximum records per profile timeline. | 100 |
| sort | Search order: latest, top, both, or automatic fallback. | latest |
| profileMode | Which profile tab to use, such as tweets, replies, or media. | tweets |
| filters | Optional filters for language, engagement, media, links, replies, and retweets. | {} |
| crawlStrategy | How the actor loads pages: cheap HTTP, HTTP first, or browser only. | http_only |
| browserFallbackOnEmpty | Use a browser if the cheap request returns no tweets or an access warning. | false |
| downloadMedia | Download images/videos to storage instead of only collecting metadata. | false |
| flattenOutput | Make nested fields easier to use in CSV/spreadsheets. | false |
| outputFields | Keep only selected fields in the final output. | [] |
| includeRagFields | Add AI-friendly text chunks and metadata for search/chatbot workflows. | false |
| monitoring | Save state between runs and emit only new items. | disabled |
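
Before starting a run from code, it can help to fill in the documented defaults and catch obvious mistakes locally. The sketch below does that with the defaults from the table (the monitoring object is left out). The `resolve_input` helper is hypothetical, and the rule that at least one input source must be set is an assumption, not something the actor is known to enforce.

```python
import copy

# Defaults copied from the table above.
DEFAULTS = {
    "scrapeMode": "auto",
    "searchTerms": [],
    "twitterHandles": [],
    "startUrls": [],
    "tweetIds": [],
    "maxItems": 100,
    "maxItemsPerQuery": 100,
    "maxItemsPerProfile": 100,
    "sort": "latest",
    "profileMode": "tweets",
    "filters": {},
    "crawlStrategy": "http_only",
    "browserFallbackOnEmpty": False,
    "downloadMedia": False,
    "flattenOutput": False,
    "outputFields": [],
    "includeRagFields": False,
}
CRAWL_STRATEGIES = {"http_only", "http_first", "browser_only"}
SOURCE_FIELDS = ("searchTerms", "twitterHandles", "startUrls", "tweetIds")


def resolve_input(user_input):
    """Merge user input over the defaults and list any obvious problems."""
    resolved = {**copy.deepcopy(DEFAULTS), **user_input}
    problems = []
    if resolved["crawlStrategy"] not in CRAWL_STRATEGIES:
        problems.append(f"unknown crawlStrategy: {resolved['crawlStrategy']!r}")
    # Assumption: a run needs at least one input source to do anything useful.
    if not any(resolved[field] for field in SOURCE_FIELDS):
        problems.append("no search terms, handles, URLs, or tweet IDs given")
    return resolved, problems


resolved, problems = resolve_input({"twitterHandles": ["NASA"], "maxItems": 10})
```

An empty `problems` list means the input passed these local checks; it does not guarantee that X/Twitter will return data.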

Output 📦

Results are saved to the Apify dataset. Most successful records are tweet records.

Example tweet record:

{
  "recordType": "tweet",
  "tweetId": "2064422103416238295",
  "url": "https://x.com/NASA/status/2064422103416238295",
  "text": "Pinned NASA @NASA Jun 9 Introducing Artemis III...",
  "cleanText": "Pinned NASA @NASA Jun 9 Introducing Artemis III...",
  "author": {
    "userName": "NASA",
    "url": "https://x.com/NASA"
  },
  "isReply": false,
  "isRetweet": false,
  "isQuote": false,
  "discovery": {
    "inputType": "twitterHandles",
    "handle": "NASA"
  },
  "scrapedAt": "2026-06-14T15:40:03.212Z"
}

Depending on the page and settings, records may include:

  • Tweet ID and URL
  • Tweet text and cleaned text
  • Author handle, name, URL, and profile details when available
  • Creation time and language when available
  • Reply, repost, quote, like, bookmark, and view counts when available
  • Media metadata when available
  • Discovery metadata showing which input produced the record
  • Optional AI/RAG fields for search and chatbot workflows
  • Optional raw data if includeRawData is enabled
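
If you post-process nested records yourself, for example to load them into a spreadsheet, a small flattening step is often enough. The sketch below turns nested objects into dotted column names. The dotted naming is an assumption made for this example; the actor's flattenOutput option may name columns differently.

```python
def flatten_record(record, prefix=""):
    """Flatten nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_record(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


# Shortened version of the example record above.
record = {
    "recordType": "tweet",
    "tweetId": "2064422103416238295",
    "author": {"userName": "NASA", "url": "https://x.com/NASA"},
    "discovery": {"inputType": "twitterHandles", "handle": "NASA"},
}
row = flatten_record(record)
```

After this step, `row` has keys such as `author.userName` and `discovery.handle`, which map directly to CSV columns.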

Warning and Error Records ⚠️

When X/Twitter does not return usable public tweet data, the actor writes a clear warning record instead of silently failing.

| Code | What it means | What to try |
| --- | --- | --- |
| X_ACCESS_WARNING | X returned a generic or restricted page. | Use http_first with browserFallbackOnEmpty: true. |
| NO_TWEETS_FOUND | The page loaded, but no public tweets were found. | Try a broader query, sort: "top", or browser fallback. |
| HTTP_FETCH_WARNING | The cheap HTTP request failed. | Retry with browser fallback or Apify Proxy. |
| REQUEST_FAILED | A browser request failed after retries. | Lower concurrency, raise timeout, or try a smaller run. |
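
When you process a dataset in code, it helps to separate tweets from warning records and look up the suggested fix for each code. The sketch below does this with the advice from the table. It assumes that non-tweet records expose their code in a `code` field, which this README does not confirm, and `triage` is a hypothetical helper.

```python
from collections import Counter

# Advice text taken from the table above.
ADVICE = {
    "X_ACCESS_WARNING": "Use http_first with browserFallbackOnEmpty: true.",
    "NO_TWEETS_FOUND": 'Try a broader query, sort: "top", or browser fallback.',
    "HTTP_FETCH_WARNING": "Retry with browser fallback or Apify Proxy.",
    "REQUEST_FAILED": "Lower concurrency, raise timeout, or try a smaller run.",
}


def triage(records):
    """Split records into tweets, a count per warning code, and a hint per code."""
    tweets = [r for r in records if r.get("recordType") == "tweet"]
    # Assumption: every non-tweet record carries its code in "code".
    codes = Counter(
        r.get("code", "UNKNOWN") for r in records if r.get("recordType") != "tweet"
    )
    hints = {code: ADVICE.get(code, "No documented advice for this code.") for code in codes}
    return tweets, codes, hints


sample = [
    {"recordType": "tweet", "tweetId": "1"},
    {"recordType": "warning", "code": "NO_TWEETS_FOUND"},
    {"recordType": "warning", "code": "NO_TWEETS_FOUND"},
]
tweets, codes, hints = triage(sample)
```

Counting codes first makes it easy to see whether a run failed for one reason or several before you change any settings.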

Tips for Better Results ✅

  • Test with maxItems: 10 before running a larger job.
  • If a search query returns few results, try sort: "top" or sort: "latest_and_top".
  • If your query uses until, try removing it or using smaller date windows.
  • Use dateSplitStrategy for long historical searches.
  • Keep maxConcurrency low for X/Twitter pages.
  • Enable browser fallback when HTTP-only runs return warning records.
  • Use downloadMedia: false unless you really need downloaded files.
  • Use outputFields if you only need a few columns.

Cost Notes 💸

This actor is built with cost control in mind:

  • It starts with http_only, the cheapest crawl strategy.
  • It uses one concurrent request by default.
  • It does not retry failed requests by default.
  • It does not download media by default.
  • It caps output with maxItems.

Browser fallback is more reliable, but it costs more because it launches a real browser. Use it when you need results and HTTP-only mode returns access warnings.
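
For rough budgeting, the listed price of from $1.00 per 1,000 results gives a simple lower bound. The sketch below assumes the charge scales linearly with the number of results. Actual charges may be higher, for example when a browser is launched, and this README does not say whether warning records count as results.

```python
# "from $1.00 / 1,000 results" on the listing: treat it as a floor, not a quote.
PRICE_PER_1000_USD = 1.00


def min_result_cost(result_count, price_per_1000=PRICE_PER_1000_USD):
    """Lower-bound cost in USD for a given number of results."""
    return result_count / 1000 * price_per_1000


for count in (10, 200, 2500):
    print(f"{count} results: at least ${min_result_cost(count):.2f}")
```

A 10-item test run therefore costs at least about one cent at the listed rate, which is why starting small is a cheap way to validate a query.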

Limitations 🧭

  • This actor only collects public data.
  • It does not access protected/private accounts.
  • It does not require or store X/Twitter login credentials.
  • X/Twitter may hide, rate-limit, or change public pages at any time.
  • Some fields are best-effort because X may not expose them on every page.
  • Media downloading can increase runtime and storage usage.
  • Conversation and search behavior depends on what X/Twitter publicly serves at run time.

Troubleshooting 🔧

I got X_ACCESS_WARNING ⚠️

X likely returned a generic page instead of tweet data. Switch to:

{
  "crawlStrategy": "http_first",
  "browserFallbackOnEmpty": true
}

I got NO_TWEETS_FOUND 🔍

Try a less restrictive query, a public profile with recent posts, sort: "top", or browser fallback.

I got fewer results than expected 📉

Check maxItems, maxItemsPerQuery, and maxItemsPerProfile. Also remember that X/Twitter may show different results depending on search mode, date range, location, and whether the page is loaded in a browser.

My run is too expensive 💸

Lower maxItems, keep downloadMedia disabled, use maxConcurrency: 1, and start with http_only. Use browser fallback only when needed.

I see duplicate tweets ♻️

Keep deduplicateBy: "tweetId". If you use latest_and_top, the same tweet can appear in both searches, so deduplication is recommended.
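
If you merge results from several runs or exports yourself, the same idea is easy to apply afterwards. The sketch below keeps the first record seen for each tweetId. It is a generic illustration, not the actor's own deduplication code, and `deduplicate_by_tweet_id` is a hypothetical helper.

```python
def deduplicate_by_tweet_id(records):
    """Keep the first record for each tweetId; records without an ID are kept as-is."""
    seen = set()
    unique = []
    for record in records:
        tweet_id = record.get("tweetId")
        if tweet_id is None:
            unique.append(record)
        elif tweet_id not in seen:
            seen.add(tweet_id)
            unique.append(record)
    return unique


# The same tweet can show up in both the "latest" and the "top" results.
latest = [{"tweetId": "1"}, {"tweetId": "2"}]
top = [{"tweetId": "2"}, {"tweetId": "3"}]
merged = deduplicate_by_tweet_id(latest + top)
```

Order is preserved, so records from the first list win when the same tweet appears twice.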

For Developers 🧑‍💻

Run locally:

npm install
npm test
npm run build
npm run dev

Deploy to Apify:

apify push

Support 🙋

If results look wrong, include the run ID, input JSON, and a short description of what you expected. The run summary and dataset warning records usually show whether the issue came from input settings, public X/Twitter access limits, or page changes.